-
Notifications
You must be signed in to change notification settings - Fork 579
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ConfigItem: Fix infinite recursion caused by ignore_on_error
when …
#9406
ConfigItem: Fix infinite recursion caused by ignore_on_error
when …
#9406
Conversation
Please update the PR description to specify the config that was used in each case and clearly state when a process did not terminate when it should or when it crashed. Also you have a typo in the PR title and commit message, you might want to fix that as well. |
ignore_on_error
when …ignore_on_error
when …
e8cd42d
to
4f7edd6
Compare
Updated! However, right now it just crashes, though you can check the referenced issue comment to see this repetition over and over again. You can also test the config snippet in |
a678466
to
0249b36
Compare
One thing I don't yet understand: Minimal example to reproduce this seems to be something like |
The difference between Host and Service objects exists because services are considered "unnamed items" (not because they don't have a name, but at that stage of config loading, the name is not yet known because it depends on the host name which only is known after evaluating the service body). And items are handled differently based on whether they are considered named. If they aren't, committing is tried only once (due to the swap at the end): icinga2/lib/config/configitem.cpp Lines 407 to 421 in 32c7f77
So unregistering named items seems to indeed be the correct thing to do here. |
Please also share the config you used for your performance test.
I don't think the difference in these numbers is plausible for your change on the The change to |
0249b36
to
2282553
Compare
Configobject CheckCommand "dummy" {
import "dummy-check-command"
vars.dummy_state = 0
vars.dummy_text = "Check was successful."
}
template Host "generic-host" {
max_check_attempts = 3
check_interval = 1m
retry_interval = 30s
check_command = "dummy"
}
template Service "generic-service" {
max_check_attempts = 5
check_interval = 1m
retry_interval = 30s
}
object Host NodeName {
import "generic-host"
address = "127.0.0.1"
address6 = "::1"
vars.os = "Linux"
}
for (idx in range(20000)) {
if ((idx % 2) == 0) {
var hostname = "icinga-" + string(idx)
} else {
var hostname = "dummy-" + string(idx)
}
object Host hostname {
import "generic-host"
address = "127.0.0.1"
if (match("*icinga*", this.name)) {
address = "10.211.55.31"
}
vars.is_dummy = true
}
}
for (i in range(10)) {
apply Service "random" + i {
import "generic-service"
check_command = "dummy"
assign where true
}
} Tests: Before~/Workspace/icinga2 (bugfix/handle-ignore-on-error-uses-cases-properly ✗) time prefix/sbin/icinga2 daemon -C
[2022-09-07 11:07:43 +0200] information/cli: Icinga application loader (version: v2.13.0-429-g86b63a57a; debug)
[2022-09-07 11:07:43 +0200] information/cli: Loading configuration file(s).
[2022-09-07 11:07:44 +0200] information/ConfigItem: Committing config item(s).
[2022-09-07 11:07:44 +0200] information/ApiListener: My API identity: yonass-mbp.int.netways.de
[2022-09-07 11:07:54 +0200] information/WorkQueue: #4 (DaemonUtility::LoadConfigFiles) items: 0, rate: 2.2/s (132/min 132/5min 132/15min);
[2022-09-07 11:07:54 +0200] information/WorkQueue: #6 (ApiListener, SyncQueue) items: 0, rate: 0/s (0/min 0/5min 0/15min);
[2022-09-07 11:07:54 +0200] information/WorkQueue: #5 (ApiListener, RelayQueue) items: 0, rate: 0/s (0/min 0/5min 0/15min);
[2022-09-07 11:10:52 +0200] information/ConfigItem: Instantiated 1 ApiListener.
[2022-09-07 11:10:52 +0200] information/ConfigItem: Instantiated 1 CheckCommand.
[2022-09-07 11:10:52 +0200] information/ConfigItem: Instantiated 1 NotificationComponent.
[2022-09-07 11:10:52 +0200] information/ConfigItem: Instantiated 1 CheckerComponent.
[2022-09-07 11:10:52 +0200] information/ConfigItem: Instantiated 1 FileLogger.
[2022-09-07 11:10:52 +0200] information/ConfigItem: Instantiated 1 ApiUser.
[2022-09-07 11:10:52 +0200] information/ConfigItem: Instantiated 1 Endpoint.
[2022-09-07 11:10:52 +0200] information/ConfigItem: Instantiated 20001 Hosts.
[2022-09-07 11:10:52 +0200] information/ConfigItem: Instantiated 1 IcingaApplication.
[2022-09-07 11:10:52 +0200] information/ConfigItem: Instantiated 3 Zones.
[2022-09-07 11:10:52 +0200] information/ConfigItem: Instantiated 1 IcingaDB.
[2022-09-07 11:10:52 +0200] information/ConfigItem: Instantiated 200010 Services.
[2022-09-07 11:10:52 +0200] information/ScriptGlobal: Dumping variables to file '/Users/yhabteab/Workspace/icinga2/prefix/var/cache/icinga2/icinga2.vars'
[2022-09-07 11:10:52 +0200] information/cli: Finished validating the configuration file(s).
prefix/sbin/icinga2 daemon -C 494,60s user 9,12s system 261% cpu 3:12,90 total After~/Workspace/icinga2 (bugfix/handle-ignore-on-error-uses-cases-properly ✗) time prefix/sbin/icinga2 daemon -C
[2022-09-07 11:20:49 +0200] information/cli: Icinga application loader (version: v2.13.0-431-g22825535d; debug)
[2022-09-07 11:20:49 +0200] information/cli: Loading configuration file(s).
[2022-09-07 11:20:51 +0200] information/ConfigItem: Committing config item(s).
[2022-09-07 11:20:51 +0200] information/ApiListener: My API identity: yonass-mbp.int.netways.de
[2022-09-07 11:21:01 +0200] information/WorkQueue: #4 (DaemonUtility::LoadConfigFiles) items: 0, rate: 2.2/s (132/min 132/5min 132/15min);
[2022-09-07 11:21:01 +0200] information/WorkQueue: #6 (ApiListener, RelayQueue) items: 0, rate: 0/s (0/min 0/5min 0/15min);
[2022-09-07 11:21:01 +0200] information/WorkQueue: #7 (ApiListener, SyncQueue) items: 0, rate: 0/s (0/min 0/5min 0/15min);
[2022-09-07 11:23:39 +0200] information/ConfigItem: Instantiated 1 IcingaDB.
[2022-09-07 11:23:39 +0200] information/ConfigItem: Instantiated 1 ApiListener.
[2022-09-07 11:23:39 +0200] information/ConfigItem: Instantiated 1 CheckCommand.
[2022-09-07 11:23:39 +0200] information/ConfigItem: Instantiated 1 NotificationComponent.
[2022-09-07 11:23:39 +0200] information/ConfigItem: Instantiated 1 CheckerComponent.
[2022-09-07 11:23:39 +0200] information/ConfigItem: Instantiated 1 FileLogger.
[2022-09-07 11:23:39 +0200] information/ConfigItem: Instantiated 1 ApiUser.
[2022-09-07 11:23:39 +0200] information/ConfigItem: Instantiated 1 Endpoint.
[2022-09-07 11:23:39 +0200] information/ConfigItem: Instantiated 20001 Hosts.
[2022-09-07 11:23:39 +0200] information/ConfigItem: Instantiated 1 IcingaApplication.
[2022-09-07 11:23:39 +0200] information/ConfigItem: Instantiated 3 Zones.
[2022-09-07 11:23:39 +0200] information/ConfigItem: Instantiated 200010 Services.
[2022-09-07 11:23:39 +0200] information/ScriptGlobal: Dumping variables to file '/Users/yhabteab/Workspace/icinga2/prefix/var/cache/icinga2/icinga2.vars'
[2022-09-07 11:23:39 +0200] information/cli: Finished validating the configuration file(s).
prefix/sbin/icinga2 daemon -C 482,10s user 7,73s system 282% cpu 2:53,49 total |
…ommitting an item When committing an item with `ignore_on_error` flag set fails, the `Commit()` method only returns `nullptr` and the current item is not being dropped from `m_Items`. `CommittNewItems()` also doesn't check the return value of `Commit()` but just continues and tries to commit all items from `m_Items` in recursive call. Since this corrupt item is never removed from `m_Items`, it ends up in an endless recursion till it finally crashes.
This also improves the performance a bit, as we longer have to iterate over the items and copy them into the new items vector.
2282553
to
400117e
Compare
By the way, I didn't really consider the claimed performance improvement in the review as this PR isn't about performance and it doesn't look like it makes things slower. But I invite you to try if you can actually reproduce your numbers, i.e. run your test a few more times and see if one is indeed consistently faster. |
…committing an item
When committing an item with
ignore_on_error
flag set fails, theCommit()
method only returnsnullptr
and the current item is not being dropped from
m_Items
.CommittNewItems()
also doesn't check the returnvalue of
Commit()
but just continues and tries to commit all items fromm_Items
in recursive call. Sincethis corrupt item is never removed from
m_Items
, it ends up in an endless recursion till it finally crashes.Test config
Before
see #8824 (comment)
And....:
Performance
Check this out:
After
Performance
fixes #8824